
    The Complex Genetic Architecture of the Metabolome

    Discovering links between the genotype of an organism and its metabolite levels can increase our understanding of metabolism, its controls, and the indirect effects of metabolism on other quantitative traits. Recent technological advances in both DNA sequencing and metabolite profiling allow the use of broad-spectrum, untargeted metabolite profiling to generate phenotypic data for genome-wide association studies that investigate quantitative genetic control of metabolism within species. We conducted a genome-wide association study of natural variation in plant metabolism using the results of untargeted metabolite analyses performed on a collection of wild Arabidopsis thaliana accessions. Testing 327 metabolites against >200,000 single nucleotide polymorphisms identified numerous genotype–metabolite associations distributed non-randomly within the genome. These clusters of genotype–metabolite associations (hotspots) included regions of the A. thaliana genome previously identified as subject to recent strong positive selection (selective sweeps) and regions showing trans-linkage to these putative sweeps, suggesting that these selective forces have impacted genome-wide control of A. thaliana metabolism. Comparing the metabolic variation detected within this collection of wild accessions to a laboratory-derived population of recombinant inbred lines (derived from two of the accessions used in this study) showed that the higher level of genetic variation present within the wild accessions did not correspond to higher variance in metabolic phenotypes, suggesting that evolutionary constraints limit metabolic variation. While a major goal of genome-wide association studies is to develop catalogues of intraspecific variation, the results of multiple independent experiments performed for this study showed that the genotype–metabolite associations identified are sensitive to environmental fluctuations. 
Thus, studies of intraspecific variation conducted via genome-wide association will require analyses of genotype by environment interaction. Interestingly, the network structure of metabolite linkages was also sensitive to environmental differences, suggesting that key aspects of network architecture are malleable.

    Genetic Networks Controlling Structural Outcome of Glucosinolate Activation across Development

    Most phenotypic variation present in natural populations is under polygenic control, largely determined by genetic variation at quantitative trait loci (QTLs). These genetic loci frequently interact with the environment, development, and each other, yet the importance of these interactions on the underlying genetic architecture of quantitative traits is not well characterized. To better study how epistasis and development may influence quantitative traits, we studied genetic variation in Arabidopsis glucosinolate activation using the Bayreuth×Shahdara recombinant inbred population, which is moderately sized in terms of number of lines. We identified QTLs for glucosinolate activation at three different developmental stages. Numerous QTLs showed developmental dependency, as well as a large epistatic network, centered on the previously cloned large-effect glucosinolate activation QTL, ESP. Analysis of Heterogeneous Inbred Families validated seven loci and all of the QTL×DPG (days post-germination) interactions tested, but was complicated by the extensive epistasis. A comparison of transcript accumulation data within 211 of these RILs showed an extensive overlap of gene expression QTLs for structural specifiers and their homologs with the identified glucosinolate activation loci. Finally, we were able to show that two of the QTLs are the result of whole-genome duplications of a glucosinolate activation gene cluster. These data reveal complex age-dependent regulation of structural outcomes and suggest that transcriptional regulation is associated with a significant portion of the underlying ontogenic variation and epistatic interactions in glucosinolate activation.

    Mapping transcription mechanisms from multimodal genomic data

    Background: Identification of expression quantitative trait loci (eQTLs) is an emerging area in genomic study. The task requires an integrated analysis of genome-wide single nucleotide polymorphism (SNP) data and gene expression data, raising a new computational challenge due to the tremendous size of data. Results: We develop a method to identify eQTLs. The method represents eQTLs as information flux between genetic variants and transcripts. We use information theory to simultaneously interrogate SNP and gene expression data, resulting in a Transcriptional Information Map (TIM) which captures the network of transcriptional information that links genetic variations, gene expression and regulatory mechanisms. These maps are able to identify both cis- and trans-regulating eQTLs. The application on a dataset of leukemia patients identifies eQTLs in the regions of the GART, PCP4, DSCAM, and RIPK4 genes that regulate ADAMTS1, a known leukemia correlate. Conclusions: The information theory approach presented in this paper is able to infer the dependence networks between SNPs and transcripts, which in turn can identify cis- and trans-eQTLs. The application of our method to the leukemia study explains how genetic variants and gene expression are linked to leukemia. Funding: National Human Genome Research Institute (U.S.) (R01HG003354); National Institute of Allergy and Infectious Diseases (U.S.) (U19 AI067854-05); National Heart, Lung, and Blood Institute (grant T32 HL007427-28); National Institutes of Health (U.S.) (grant K99 LM009826)
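
The abstract above does not say which information-theoretic quantity the Transcriptional Information Map is built from. As a rough illustration only, the sketch below uses mutual information between a SNP genotype and a binned expression level, one common way to score dependence between a genetic variant and a transcript. The function names, the two-bin discretization, and the toy data are all assumptions made for this example, not the authors' method.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # Each term is p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts.
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

def discretize(values, n_bins=2):
    """Equal-frequency binning by rank (hypothetical helper for this sketch)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins

# Toy data: eight individuals, one transcript, two candidate SNPs.
expression = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
snp_linked = [0, 0, 0, 0, 1, 1, 1, 1]    # genotype tracks expression
snp_unlinked = [0, 1, 0, 1, 0, 1, 0, 1]  # genotype independent of expression

expr_bins = discretize(expression)
mi_linked = mutual_information(snp_linked, expr_bins)
mi_unlinked = mutual_information(snp_unlinked, expr_bins)
```

In this toy case the linked SNP carries one full bit of information about the binned expression level and the unlinked SNP carries none. A real analysis would also need a significance test and correction for the very large number of SNP-transcript pairs.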

    Genomic Analysis of QTLs and Genes Altering Natural Variation in Stochastic Noise

    Quantitative genetic analysis has long been used to study how natural variation of genotype can influence an organism's phenotype. While most studies have focused on genetic determinants of phenotypic average, it is rapidly becoming understood that stochastic noise is genetically determined. However, it is not known how many traits display genetic control of stochastic noise nor how broadly these stochastic loci are distributed within the genome. Understanding these questions is critical to our understanding of quantitative traits and how they relate to the underlying causal loci, especially since stochastic noise may be directly influenced by underlying changes in the wiring of regulatory networks. We identified QTLs controlling natural variation in stochastic noise of glucosinolates, plant defense metabolites, as well as QTLs for stochastic noise of related transcripts. These loci included stochastic noise QTLs unique for either transcript or metabolite variation. Validation of these loci showed that genetic polymorphism within the regulatory network alters stochastic noise independent of effects on corresponding average levels. We examined this phenomenon more globally, using transcriptomic datasets, and found that the Arabidopsis transcriptome exhibits significant, heritable differences in stochastic noise. Further analysis allowed us to identify QTLs that control genomic stochastic noise. Some genomic QTL were in common with those altering average transcript abundance, while others were unique to stochastic noise. Using a single isogenic population, we confirmed that natural variation at ELF3 alters stochastic noise in the circadian clock and metabolism. Since polymorphisms controlling stochastic noise in genomic phenotypes exist within wild germplasm for naturally selected phenotypes, this suggests that analysis of Arabidopsis evolution should account for genetic control of stochastic variance and average phenotypes. 
It remains to be determined if natural genetic variation controlling stochasticity is equally distributed across the genomes of other multi-cellular eukaryotes.
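
The abstract above distinguishes genetic effects on a trait's average from genetic effects on its stochastic noise, but does not state how noise was quantified. The sketch below assumes one simple choice, the coefficient of variation among replicate measurements of the same inbred line, and compares it between the two allele classes at a single marker. The line names, alleles, and measurements are invented for illustration.

```python
from statistics import mean, stdev

def coefficient_of_variation(replicates):
    """Sample standard deviation divided by the mean of replicate measurements."""
    return stdev(replicates) / mean(replicates)

# Hypothetical data: line name -> (allele at one marker, replicate trait values).
lines = {
    "line1": ("A", [10.0, 10.5, 9.5]),
    "line2": ("A", [20.0, 21.0, 19.0]),
    "line3": ("B", [10.0, 14.0, 6.0]),
    "line4": ("B", [20.0, 28.0, 12.0]),
}

# Collect per-line averages and per-line noise, grouped by allele.
means_by_allele = {}
cvs_by_allele = {}
for allele, replicates in lines.values():
    means_by_allele.setdefault(allele, []).append(mean(replicates))
    cvs_by_allele.setdefault(allele, []).append(coefficient_of_variation(replicates))

# Average trait level and average noise for each allele class.
mean_by_allele = {a: mean(v) for a, v in means_by_allele.items()}
cv_by_allele = {a: mean(v) for a, v in cvs_by_allele.items()}
```

Here both allele classes have the same average trait value while the B allele shows eightfold higher replicate-to-replicate variation, which is the kind of pattern a noise QTL independent of the mean would produce. A real QTL scan would test this difference statistically at every marker.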

    Magnitude and Timing of Leaf Damage Affect Seed Production in a Natural Population of Arabidopsis thaliana (Brassicaceae)

    Background: The effect of herbivory on plant fitness varies widely. Understanding the causes of this variation is of considerable interest because of its implications for plant population dynamics and trait evolution. We experimentally defoliated the annual herb Arabidopsis thaliana in a natural population in Sweden to test the hypotheses that (a) plant fitness decreases with increasing damage, (b) tolerance to defoliation is lower before flowering than during flowering, and (c) defoliation before flowering reduces the number of seeds more strongly than defoliation during flowering, but the opposite is true for effects on seed size. Methodology/Principal Findings: In a first experiment, between 0 and 75% of the leaf area was removed in May from plants that flowered or were about to start flowering. In a second experiment, 0, 25%, or 50% of the leaf area was removed from plants on one of two occasions, in mid-April when plants were either in the vegetative rosette or bolting stage, or in mid-May when plants were flowering. In the first experiment, seed production was negatively related to leaf area removed, and at the highest damage level mean seed size was also reduced. In the second experiment, removal of 50% of the leaf area reduced seed production by 60% among plants defoliated early in the season at the vegetative rosette stage, and by 22% among plants defoliated early in the season at the bolting stage, but did not reduce seed output of plants defoliated one month later. No seasonal shift in the effect of defoliation on seed size was detected. Conclusions/Significance: The results show that leaf damage may reduce the fitness of A. thaliana, and suggest that in this population leaf herbivores feeding on plants before flowering should exert stronger selection on defence traits than those feeding on plants during flowering, given similar damage levels.

    Genetic Networks of Liver Metabolism Revealed by Integration of Metabolic and Transcriptional Profiling

    Although numerous quantitative trait loci (QTL) influencing disease-related phenotypes have been detected through gene mapping and positional cloning, identification of the individual gene(s) and molecular pathways leading to those phenotypes is often elusive. One way to improve understanding of genetic architecture is to classify phenotypes in greater depth by including transcriptional and metabolic profiling. In the current study, we have generated and analyzed mRNA expression and metabolic profiles in liver samples obtained in an F2 intercross between the diabetes-resistant C57BL/6 leptin(ob/ob) and the diabetes-susceptible BTBR leptin(ob/ob) mouse strains. This cross, which segregates for genotype and physiological traits, was previously used to identify several diabetes-related QTL. Our current investigation includes microarray analysis of over 40,000 probe sets, plus quantitative mass spectrometry-based measurements of sixty-seven intermediary metabolites in three different classes (amino acids, organic acids, and acyl-carnitines). We show that liver metabolites map to distinct genetic regions, thereby indicating that tissue metabolites are heritable. We also demonstrate that genomic analysis can be integrated with liver mRNA expression and metabolite profiling data to construct causal networks for control of specific metabolic processes in liver. As a proof of principle of the practical significance of this integrative approach, we illustrate the construction of a specific causal network that links gene expression and metabolic changes in the context of glutamate metabolism, and demonstrate its validity by showing that genes in the network respond to changes in glutamine and glutamate availability. Thus, the methods described here have the potential to reveal regulatory networks that contribute to chronic, complex, and highly prevalent diseases and conditions such as obesity and diabetes.

    What Can Causal Networks Tell Us about Metabolic Pathways?

    Graphical models describe the linear correlation structure of data and have been used to establish causal relationships among phenotypes in genetic mapping populations. Data are typically collected at a single point in time. Biological processes, on the other hand, are often non-linear and display time-varying dynamics. The extent to which graphical models can recapitulate the architecture of underlying biological processes is not well understood. We consider metabolic networks with known stoichiometry to address the fundamental question: “What can causal networks tell us about metabolic pathways?” Using data from an Arabidopsis BaySha population and simulated data from dynamic models of pathway motifs, we assess our ability to reconstruct metabolic pathways using graphical models. Our results highlight the necessity of non-genetic residual biological variation for reliable inference. Recovery of the ordering within a pathway is possible, but should not be expected. Causal inference is sensitive to subtle patterns in the correlation structure that may be driven by a variety of factors, which may not emphasize the substrate-product relationship. We illustrate the effects of metabolic pathway architecture, epistasis and stochastic variation on correlation structure and graphical model-derived networks. We conclude that graphical models should be interpreted cautiously, especially if the implied causal relationships are to be used in the design of intervention strategies.

    Genotype and Gene Expression Associations with Immune Function in Drosophila

    It is now well established that natural populations of Drosophila melanogaster harbor substantial genetic variation associated with physiological measures of immune function. In no case, however, have intermediate measures of immune function, such as transcriptional activity of immune-related genes, been tested as mediators of phenotypic variation in immunity. In this study, we measured bacterial load sustained after infection of D. melanogaster with Serratia marcescens, Providencia rettgeri, Enterococcus faecalis, and Lactococcus lactis in a panel of 94 third-chromosome substitution lines. We also measured transcriptional levels of 329 immune-related genes eight hours after infection with E. faecalis and S. marcescens in lines from the phenotypic tails of the test panel. We genotyped the substitution lines at 137 polymorphic markers distributed across 25 genes in order to test for statistical associations among genotype, bacterial load, and transcriptional dynamics. We find that genetic polymorphisms in the pathogen recognition genes (and particularly in PGRP-LC, GNBP1, and GNBP2) are most significantly associated with variation in bacterial load. We also find that overall transcriptional induction of effector proteins is a significant predictor of bacterial load after infection with E. faecalis, and that a marker upstream of the recognition gene PGRP-SD is statistically associated with variation in both bacterial load and transcriptional induction of effector proteins. These results show that polymorphism in genes near the top of the immune system signaling cascade can have a disproportionate effect on organismal phenotype due to the amplification of minor effects through the cascade.

    Quantitative and Qualitative Stem Rust Resistance Factors in Barley Are Associated with Transcriptional Suppression of Defense Regulons

    Stem rust (Puccinia graminis f. sp. tritici; Pgt) is a devastating fungal disease of wheat and barley. Pgt race TTKSK (isolate Ug99) is a serious threat to these Triticeae grain crops because resistance is rare. In barley, the complex Rpg-TTKSK locus on chromosome 5H is presently the only known source of qualitative resistance to this aggressive Pgt race. Segregation for resistance observed on seedlings of the Q21861 × SM89010 (QSM) doubled-haploid (DH) population was found to be predominantly qualitative, with little of the remaining variance explained by loci other than Rpg-TTKSK. In contrast, analysis of adult QSM DH plants infected by field inoculum of Pgt race TTKSK in Njoro, Kenya, revealed several additional quantitative trait loci that contribute to resistance. To molecularly characterize these loci, Barley1 GeneChips were used to measure the expression of 22,792 genes in the QSM population after inoculation with Pgt race TTKSK or mock-inoculation. Comparison of expression Quantitative Trait Loci (eQTL) between treatments revealed an inoculation-dependent expression polymorphism implicating Actin depolymerizing factor3 (within the Rpg-TTKSK locus) as a candidate susceptibility gene. In parallel, we identified a chromosome 2H trans-eQTL hotspot that co-segregates with an enhancer of Rpg-TTKSK-mediated, adult plant resistance discovered through the Njoro field trials. Our genome-wide eQTL studies demonstrate that transcript accumulation of 25% of barley genes is altered following challenge by Pgt race TTKSK, but that few of these genes are regulated by the qualitative Rpg-TTKSK on chromosome 5H. It is instead the chromosome 2H trans-eQTL hotspot that orchestrates the largest inoculation-specific responses, where enhanced resistance is associated with transcriptional suppression of hundreds of genes scattered throughout the genome.
Hence, the present study associates the early suppression of genes expressed in this host–pathogen interaction with enhancement of R-gene mediated resistance.

    Metabolic Profiling of a Mapping Population Exposes New Insights in the Regulation of Seed Metabolism and Seed, Fruit, and Plant Relations

    To investigate the regulation of seed metabolism and to estimate the degree of metabolic natural variability, metabolite profiling and network analysis were applied to a collection of 76 different homozygous tomato introgression lines (ILs) grown in the field in two consecutive harvest seasons. Factorial ANOVA confirmed the presence of 30 metabolite quantitative trait loci (mQTL). Amino acid contents displayed a high degree of variability across the population, with similar patterns across the two seasons, while sugars exhibited significant seasonal fluctuations. Upon integration of data for tomato pericarp metabolite profiling, factorial ANOVA identified the main factor for metabolic polymorphism to be the genotypic background rather than the environment or the tissue. Analysis of the coefficient of variance indicated greater phenotypic plasticity in the ILs than in the M82 tomato cultivar. Broad-sense estimate of heritability suggested that the mode of inheritance of metabolite traits in the seed differed from that in the fruit. Correlation-based metabolic network analysis comparing metabolite data for the seed with that for the pericarp showed that the seed network displayed tighter interdependence of metabolic processes than the fruit. Amino acids in the seed metabolic network were shown to play a central hub-like role in the topology of the network, maintaining high interactions with other metabolite categories, i.e., sugars and organic acids. Network analysis identified six exceptionally highly co-regulated amino acids, Gly, Ser, Thr, Ile, Val, and Pro. The strong interdependence of this group was confirmed by the mQTL mapping. Taken together these results (i) reflect the extensive redundancy of the regulation underlying seed metabolism, (ii) demonstrate the tight co-ordination of seed metabolism with respect to fruit metabolism, and (iii) emphasize the centrality of the amino acid module in the seed metabolic network. 
Finally, the study highlights the added value of integrating metabolic network analysis with mQTL mapping.